12 research outputs found

    Listener Background in L2 Speech Evaluation

    Listeners are an integral part of second language (L2) oral performance assessment. However, listeners' evaluations are susceptible to listener background variables and biases. These variables and preexisting biases distort native speaker (NS) listeners' perceptions of non-native speakers' (NNSs') speech performance and introduce error into their oral performance assessment. Among listener background variables, listeners' first language status, the amount of exposure to different English varieties, listeners' educational background, prior language teaching experience, linguistic stereotyping of NNSs, and listener attitude have been investigated in the literature and are assumed to introduce a sizable amount of variation into speakers' oral proficiency true scores. To minimize listeners' bias in the assessment context, listeners are provided with intensive training programs in which they are trained to rate NNSs' speech more objectively using scoring rubrics. To mediate listeners' bias in social contexts, the literature has provided strands of evidence in favor of structured intergroup contact programs, which are inoculations particularly devised to improve NSs' attitudes, thereby making them more receptive to NNSs' English varieties. To enhance L2 listeners' self-efficacy and foster their autonomy, L2 instructors are encouraged to emphasize explicit instruction of listening strategies.

    The effects of situational contexts and occupational roles on listeners' judgements on accented speech

    Much language attitude research has demonstrated that people make biased judgements based on speakers' language choice and accent. However, the influence of occupational context on listeners' perceptions of accented English is largely unknown. This verbal guise study examined the extent to which academic contexts and workforce-related professional contexts affect listeners' judgements of accented speech. Results revealed that simulated contexts made a significant difference in listeners' perceptual judgements, with speakers perceived as significantly more comprehensible and acceptable in service-occupational roles than in academic contexts. These findings suggest that listeners' speech judgements can be heavily influenced by speakers' situational contexts. The study also provides evidence in support of the fluency principle, showing that listeners may evaluate accented speech more negatively if it requires more processing effort. The findings inform the domains of linguistic stereotyping and listeners' attitudes towards accented speech.

    Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment

    Current leading mispronunciation detection and diagnosis (MDD) systems achieve promising performance via end-to-end phoneme recognition. One challenge of such end-to-end solutions is the scarcity of human-annotated phonemes on natural L2 speech. In this work, we leverage unlabeled L2 speech via a pseudo-labeling (PL) procedure and extend the fine-tuning approach based on pre-trained self-supervised learning (SSL) models. Specifically, we use Wav2vec 2.0 as our SSL model, and fine-tune it using original labeled L2 speech samples plus the created pseudo-labeled L2 speech samples. Our pseudo labels are dynamic and are produced by an ensemble of the online model on-the-fly, which ensures that our model is robust to pseudo label noise. We show that fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction and 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning baseline. The proposed PL method is also shown to outperform conventional offline PL methods. Compared to the state-of-the-art MDD systems, our MDD solution produces a more accurate and consistent phonetic error diagnosis. In addition, we conduct an open test on a separate UTD-4Accents dataset, where our system recognition outputs show a strong correlation with human perception, based on accentedness and intelligibility. Comment: Accepted to Interspeech 202
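
For readers unfamiliar with the phoneme error rate (PER) metric cited in the abstract above, the following is a minimal sketch of how PER is commonly computed: the Levenshtein edit distance between a reference phoneme sequence and a recognized one, divided by the reference length. The function names and the example phoneme sequences are illustrative assumptions; the abstract does not describe the authors' scoring code, and their exact procedure may differ.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    # prev[j] holds the distance between the first i-1 reference items and the first j hypothesis items.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER = edit distance / number of reference phonemes (illustrative definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution (DH -> D) and one deletion (T).
reference = ["DH", "AH", "K", "AE", "T"]
recognized = ["D", "AH", "K", "AE"]
per = phoneme_error_rate(reference, recognized)
```

In this hypothetical example, two edits against five reference phonemes give a PER of 0.4. Because insertions are counted, PER can exceed 1.0 when the recognized sequence is much longer than the reference.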